Abstract

We review some simple techniques based on monotone mass transport that allow one
to obtain transport-type inequalities for any log-concave probability measure, and for
more general measures as well. We discuss quantitative forms of these inequalities,
with application to the Brascamp-Lieb variance inequality.


1 Introduction 

Throughout the paper we work, when needed, with some fixed scalar product $\cdot$ and
Euclidean norm $|\cdot|$ on $\mathbb{R}^n$. Although our main motivation is to analyse
log-concave densities, meaning densities of the form $e^{-V}$ with $V$ convex, our results
apply to more general situations, regardless of the convexity of the potential $V$. We can
often work with a locally Lipschitz function $V : \mathbb{R}^n \to \mathbb{R}$ with the
mild assumption that

$$\int \big(|x|^2 + |\nabla V(x)|^2\big)\, e^{-V(x)}\, dx < +\infty. \qquad (1)$$

Actually, when $V$ is convex, we don't need this assumption, but not much is lost by
imposing it. Given such $V$, we introduce the probability measure $\mu_V$ defined by

$$d\mu_V(x) := \frac{e^{-V(x)}}{\int e^{-V}}\, dx.$$

Note that the density is by assumption everywhere strictly positive.

Following Kantorovich's idea, given a function $c : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$
(one interprets $c(x, y)$ as the cost of moving a unit mass from $x$ to $y$ or of bringing
back a unit mass from $y$ to $x$), we can define a transportation cost $\mathcal{W}_c$
between two Borel probability measures $\mu$ and $\nu$ on $\mathbb{R}^n$ by

$$\mathcal{W}_c(\mu, \nu) := \inf_{\pi} \iint_{\mathbb{R}^n \times \mathbb{R}^n} c(x,y)\, d\pi(x,y)$$

where the infimum is taken over all probability measures $\pi$ on $\mathbb{R}^n \times \mathbb{R}^n$
projecting on $\mu$ and $\nu$, respectively. From the definition of $\mathcal{W}_c(\mu, \nu)$ and
Fubini's theorem, we see that it suffices that the cost is well defined on
$(\mathbb{R}^n \setminus X) \times \mathbb{R}^n$ only, where $\mu(X) = 0$. Under very mild
hypotheses on $c$, one can prove that there exists a coupling $\pi$ which is optimal, that is,
which achieves the infimum above (see m Chapters 4 and 5]). The cost $c(x,y) = |y - x|^p$,
$p \in [1,+\infty)$, is used for the definition of the $L^p$-Kantorovich-Rubinstein (or Wasserstein)
distance

$$W_p(\mu,\nu) := \big(\mathcal{W}_{|\cdot - \cdot|^p}(\mu,\nu)\big)^{1/p}.$$

Recall that given two probability measures $\mu$ and $\nu$ on $\mathbb{R}^n$, the relative
entropy of $\nu$ with respect to $\mu$ is defined by

$$H(\nu\|\mu) := \begin{cases} \displaystyle\int f \log f \, d\mu & \text{if } d\nu(x) = f(x)\,d\mu(x) \text{ with } f\log_+(f) \in L^1(\mu), \\ +\infty & \text{otherwise.} \end{cases}$$

Accordingly, we should only consider probability measures that have a density, in short
"absolutely continuous" probability measures. Recall also that the variance of a function
$g \in L^2(\mu)$ is defined by

$$\mathrm{Var}_\mu(g) := \int \Big(g - \int g \, d\mu\Big)^2 d\mu.$$

The inequality in the next Proposition appeared in [5] where it was derived in the dual
form (5) as a consequence of the Prékopa-Leindler inequality. By now it is folklore in
optimal mass transportation theory and known to most specialists. The investigation of
equality cases seems to be new.

Proposition 1. Let $V : \mathbb{R}^n \to \mathbb{R}$ be a locally Lipschitz function
satisfying (1). Define for every $y$ and almost every $x$ in $\mathbb{R}^n$ the (asymmetric) cost

$$c_V(x, y) := V(y) - V(x) - \nabla V(x) \cdot (y - x). \qquad (2)$$

Then, for every (absolutely continuous) probability measure $\nu$ on $\mathbb{R}^n$ we have

$$\mathcal{W}_{c_V}(\mu_V, \nu) \le H(\nu\|\mu_V). \qquad (3)$$

Moreover, when $V$ is convex, equality holds if and only if $\nu$ is a translate of $\mu_V$.

Remark 2. Regarding the treatment of the equality case, we can prove a sharper result.
Namely, we will establish the following statement: Let $V : \mathbb{R}^n \to \mathbb{R}$ be
a locally Lipschitz function satisfying (1), and assume that $\mu_V$ has a positive Cheeger
constant $h(\mu_V) > 0$ (see definition below; this is the case when $V$ is convex). Then,
there is equality in the transport inequality (3) of Proposition 1 if and only if $V$ is
convex and $\nu$ is a translate of $\mu_V$.

The fact that the convexity of $V$ is necessary for equality cases is reminiscent of the
equality cases in the Brunn-Minkowski inequality.

When $V$ is convex, it is possible to define the cost for every $x$ (and not just for almost
every $x$) by using the subgradient $\partial V(x)$ at $x$ of the convex function $V$ (see [29]
for background on subgradients):

$$\partial V(x) := \{ w \in \mathbb{R}^n \; ; \; V(x + h) \ge V(x) + w \cdot h, \ \forall h \in \mathbb{R}^n \}.$$

Indeed, the Proposition can then be stated with the following cost $c_V$ in place of (2):

$$\forall (x, y) \in \mathbb{R}^n \times \mathbb{R}^n, \qquad c_V(x,y) := \sup_{w \in \partial V(x)} \big[ V(y) - V(x) - w \cdot (y - x) \big]. \qquad (4)$$

Recall that $V$ is locally Lipschitz and so differentiable $\mu_V$-almost-everywhere.

Note that when $V$ is convex, we have $c_V(x,y) \ge 0$ with $c_V(x,x) = 0$ in (2) and (4);
when $V$ is strictly convex, $c_V(x,y) > 0$ if $x \ne y$.
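
As a purely illustrative sketch (not part of the original argument), the cost (2) can be evaluated numerically for a differentiable potential. The helper name `bregman_cost` and the quartic potential are hypothetical choices made only for this example; the Gaussian potential $V(x) = |x|^2/2$ is the one discussed in the text.

```python
import numpy as np

def bregman_cost(V, grad_V, x, y):
    """c_V(x, y) = V(y) - V(x) - grad V(x) . (y - x), as in (2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    return V(y) - V(x) - np.dot(grad_V(x), y - x)

# Gaussian potential V(x) = |x|^2 / 2.
def V_gauss(x):
    return 0.5 * np.dot(x, x)

def grad_V_gauss(x):
    return x

# A convex but not uniformly convex potential (hypothetical example).
def V_quartic(x):
    return 0.25 * np.sum(x**4)

def grad_V_quartic(x):
    return x**3

rng = np.random.default_rng(0)
x, y = rng.normal(size=3), rng.normal(size=3)
gauss_cost = bregman_cost(V_gauss, grad_V_gauss, x, y)
quartic_cost = bregman_cost(V_quartic, grad_V_quartic, x, y)
```

For the Gaussian potential the cost reduces to $|y-x|^2/2$, and for any convex differentiable potential it is nonnegative and vanishes on the diagonal.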

Let us mention that by a simple and standard dualization procedure for transportation
inequalities (see [21]), the statement of Proposition 1 is equivalent to the following infimal
convolution inequality: for every (bounded) function $g : \mathbb{R}^n \to \mathbb{R}$,

$$\int e^{Q_{c_V}(g)} \, d\mu_V \le e^{\int g \, d\mu_V}, \qquad (5)$$

where

$$Q_{c_V}(g)(y) := \inf_{x} \{ g(x) + c_V(x,y) \}. \qquad (6)$$

We should also mention that transportation cost inequalities of the form stated above
imply concentration of measure inequalities (for $c_V$-neighborhoods); we refer to [21] for
details.
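
The infimal convolution (6) is easy to approximate on a grid, which gives a quick sanity check of the dual formulation. The sketch below works in dimension one with the Gaussian potential $V(x) = x^2/2$, so that $c_V(x,y) = (y-x)^2/2$, and it assumes that the dual inequality reads $\int e^{Q_{c_V}(g)}\,d\mu_V \le \exp(\int g\,d\mu_V)$. The grid, the test function $g = \sin$ and the helper `integrate` are illustrative assumptions, not taken from the paper.

```python
import numpy as np

# Grid on [-8, 8]; mu_V is the standard Gaussian on R.
x = np.linspace(-8.0, 8.0, 1601)
dx = x[1] - x[0]
density = np.exp(-0.5 * x**2) / np.sqrt(2.0 * np.pi)

def integrate(values):
    """Riemann sum of `values` against the Gaussian density on the grid."""
    return float(np.sum(values * density) * dx)

g = np.sin(x)                                  # bounded test function (arbitrary choice)
cost = 0.5 * (x[None, :] - x[:, None]) ** 2    # cost[i, j] = c_V(x_i, y_j)
Qg = np.min(g[:, None] + cost, axis=0)         # Qg[j] = min_i { g(x_i) + c_V(x_i, y_j) }

lhs = integrate(np.exp(Qg))                    # approximates int exp(Q g) d mu_V
rhs = float(np.exp(integrate(g)))              # approximates exp( int g d mu_V )
```

Restricting the infimum to grid points can only increase `Qg`, so a numerical value of `lhs` below `rhs` is consistent with the inequality rather than an artefact of the discretization.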

The interest of the statement in Proposition 1 resides in the fact that no uniform
convexity of $V$ is needed. This is reminiscent of the Brascamp-Lieb variance inequality [8]
(anticipated in a different context by Hörmander), which states that for a smooth convex
function $V : \mathbb{R}^n \to \mathbb{R}$ with $\int e^{-V} < +\infty$ we have, for every
locally Lipschitz function $g \in L^2(\mu_V)$,

$$\mathrm{Var}_{\mu_V}(g) \le \int (D^2 V(x))^{-1} \nabla g(x) \cdot \nabla g(x) \, d\mu_V(x). \qquad (7)$$

Since the cost $c_V(x, y)$ in Proposition 1 behaves, when $x$ and $y$ are close to each other, like
$\frac{1}{2} D^2 V(x)(y - x) \cdot (y - x)$, it follows by a standard linearization argument that
Proposition 1 implies the Brascamp-Lieb inequality (7). We shall recall the argument later.
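
The Brascamp-Lieb inequality can be checked numerically in dimension one. The following sketch uses the illustrative potential $V(x) = x^4/4 + x^2/2$ (an assumption made for this example only) and compares both sides for $g(x) = x$ and for $g = V'$, which is of the extremal form $\nabla V \cdot v_0 + c_0$.

```python
import numpy as np

x = np.linspace(-6.0, 6.0, 4001)
dx = x[1] - x[0]

V = 0.25 * x**4 + 0.5 * x**2     # convex potential (illustrative choice)
dV = x**3 + x                     # V'
d2V = 3.0 * x**2 + 1.0            # V'' > 0

weights = np.exp(-V)
weights /= np.sum(weights) * dx   # density of mu_V on the grid

def mean(values):
    """Riemann-sum approximation of the integral of `values` against mu_V."""
    return float(np.sum(values * weights) * dx)

def variance(values):
    return mean(values**2) - mean(values) ** 2

def brascamp_lieb_rhs(dvalues):
    """int (V'')^{-1} |g'|^2 d mu_V in dimension one, given g' on the grid."""
    return mean(dvalues**2 / d2V)

var_id, rhs_id = variance(x), brascamp_lieb_rhs(np.ones_like(x))   # g(x) = x
var_dV, rhs_dV = variance(dV), brascamp_lieb_rhs(d2V)              # g = V'
```

For $g(x) = x$ the inequality is strict, while for $g = V'$ both sides equal $\int V''\,d\mu_V$ (by integration by parts), up to quadrature error.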

Another interesting feature of Proposition 1 is that it is an affinely invariant statement,
in the sense that it does not depend on the Euclidean structure we put on $\mathbb{R}^n$. More
precisely, we don't need a scalar product in the statement: the gradient $w = \nabla f(x)$ (or a
subgradient) comes from a linear form $\ell = df(x) \in (\mathbb{R}^n)^*$, and we can use
$\ell(y - x)$ in place of $w \cdot (y - x)$. This reflects also in the fact that the Brascamp-Lieb
inequality (7) shares the same affine invariance: if $\Phi : \mathbb{R}^n \to \mathbb{R}^n$ is an
(invertible) affine map, then the functions $V_\Phi = V \circ \Phi$ and $g_\Phi = g \circ \Phi$
satisfy $\mathrm{Var}_{\mu_{V_\Phi}}(g_\Phi) = \mathrm{Var}_{\mu_V}(g)$ and

$$\int (D^2 V_\Phi(x))^{-1} \nabla g_\Phi(x) \cdot \nabla g_\Phi(x) \, d\mu_{V_\Phi}(x) = \int (D^2 V(x))^{-1} \nabla g(x) \cdot \nabla g(x) \, d\mu_V(x).$$

Other consequences of Proposition 1 are Talagrand's transportation inequalities for
Gaussian like measures. Observe that for the standard Gaussian measure $\gamma$, when
$V(x) = |x|^2/2$, we have

$$c_V(x,y) = |y - x|^2/2$$

and the inequality becomes exactly Talagrand's inequality [30]: for every probability
density $\nu$ on $\mathbb{R}^n$,

$$\frac{1}{2} W_2^2(\gamma, \nu) \le H(\nu\|\gamma), \qquad (8)$$

with equality if and only if $\nu$ is a translate of $\gamma$. More generally, if $V$ is $C^2$ with
$D^2 V \ge \lambda\, \mathrm{Id}$ on $\mathbb{R}^n$ for some $\lambda > 0$, then by second-order
Taylor expansion we see that the cost satisfies

$$c_V(x,y) \ge \lambda |y - x|^2/2,$$

and therefore we deduce that in this case, for every probability measure $\nu$ on $\mathbb{R}^n$
we have

$$\frac{\lambda}{2} W_2^2(\mu_V, \nu) \le H(\nu\|\mu_V). \qquad (9)$$

This inequality appeared in [5] EH] [3] . We refer to |2ll |T^ for background and references 
on transportation inequalities. 
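
In dimension one, both sides of Talagrand's inequality are explicit when $\nu$ is itself Gaussian, which gives a concrete illustration of the equality case. The sketch below relies on the classical closed forms for the relative entropy and the $W_2$ distance between one-dimensional Gaussians; the function names are hypothetical.

```python
import math

def entropy_vs_standard_gaussian(m, s):
    """H(N(m, s^2) || N(0, 1)) in dimension one (classical closed form)."""
    return 0.5 * (s * s + m * m - 1.0) - math.log(s)

def w2_squared_vs_standard_gaussian(m, s):
    """W_2^2(N(0, 1), N(m, s^2)) in dimension one (classical closed form)."""
    return m * m + (s - 1.0) ** 2

def talagrand_deficit(m, s):
    """H(nu || gamma) - W_2^2(gamma, nu) / 2 for nu = N(m, s^2)."""
    return (entropy_vs_standard_gaussian(m, s)
            - 0.5 * w2_squared_vs_standard_gaussian(m, s))

deficits = {(m, s): talagrand_deficit(m, s)
            for m in (-2.0, 0.0, 1.5) for s in (0.3, 1.0, 2.5)}
```

The deficit simplifies to $s - 1 - \log s$: it does not depend on the mean $m$ and it vanishes exactly when $s = 1$, that is, when $\nu$ is a translate of $\gamma$.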

The proof of Proposition 1 is very short; it is a minor adaptation of the transportation
proof of Talagrand's inequality (9) given in [10]. With a little more effort one can actually
prove a quantitative form of the inequality involving a remainder term. To state the result,
we need some notation. Given a probability measure $\mu$ on $\mathbb{R}^n$, we denote by $h(\mu)$
the best (i.e. largest) nonnegative constant for which the inequality

$$h(\mu) \int \Big| g - \int g \, d\mu \Big| \, d\mu \le \int |\nabla g| \, d\mu \qquad (10)$$

holds for every smooth enough $g \in L^1(\mu)$. This constant, up to a factor 2, is also known
as the Cheeger isoperimetric constant.

When $\mu$ is log-concave, it is known that $h(\mu) > 0$, and $h(\mu)^2$ is actually equivalent,
up to a universal constant, to the spectral gap of the Laplacian associated to $\mu$ (or
equivalently the inverse of the Poincaré constant). More explicitly, if we denote by $\lambda(\mu)$
the best nonnegative constant for which the inequality

$$\lambda(\mu) \int \Big| g - \int g \, d\mu \Big|^2 d\mu \le \int |\nabla g|^2 \, d\mu$$

holds for every smooth enough $g \in L^2(\mu)$, then when $\mu$ is a log-concave measure on
$\mathbb{R}^n$, we have

$$c \, h(\mu)^2 \le \lambda(\mu) \le C \, h(\mu)^2 \qquad (11)$$

for some universal (numerical) constants $c, C > 0$, independent of $\mu$ and $n$; see [221 ES].
In the rest of the paper, we will adopt the lazy but convenient tradition from asymptotic
functional analysis to call "a numerical constant $c$" any positive constant larger than 2 or
smaller than 1/2 ($c$ may even vary from line to line). So a numerical constant refers to a
universal constant (in particular it does not depend on $n$, $V$, $\mu$, $\nu$, etc.) whose exact
value is irrelevant but which could a priori be computed explicitly.

There is another natural cost function associated to any measure having a positive
Cheeger constant, namely the cost
$\min\big(h(\mu)^2 |y - x|^2, \, h(\mu)|y - x|\big) = \mathcal{N}(h(\mu)|y - x|)$

where

$$\forall t \ge 0, \qquad \mathcal{N}(t) := \min(t^2, t). \qquad (12)$$

This cost, in this form or in some equivalent form, has been studied by several authors
(see again EHHB] for details).

Since equality holds in Proposition 1 when $\nu$ is a translate of $\mu_V$, it is natural, if we
want a remainder term, to minimize over translations, or equivalently, to impose some centering.
The main result of this note is the following Theorem.

Theorem 3 (General transport inequality with a remainder term). Let $V : \mathbb{R}^n \to \mathbb{R}$
be a locally Lipschitz function satisfying (1) and let $c_V$ be the cost defined by (2). Introduce
the cost

$$\tilde{c}_V(x,y) = c_V(x,y) + c \, \mathcal{N}(h(\mu_V)|y - x|)$$

where $c > 0$ is an appropriate numerical constant.

Then, for every probability measure $\nu$ on $\mathbb{R}^n$ such that $\int x \, d\nu = \int x \, d\mu_V$
we have

$$\mathcal{W}_{\tilde{c}_V}(\mu_V, \nu) \le H(\nu\|\mu_V). \qquad (13)$$

As a consequence, we have the quantitative version of Proposition 1 when
$\int x \, d\nu = \int x \, d\mu_V$:

$$H(\nu\|\mu_V) \ge \mathcal{W}_{c_V}(\mu_V, \nu) + c \, \mathcal{W}_{\mathcal{N}(h(\mu_V)|y-x|)}(\mu_V, \nu) \qquad (14)$$

and in particular, we also have

$$H(\nu\|\mu_V) \ge \mathcal{W}_{c_V}(\mu_V, \nu) + c \min\big\{ h(\mu_V)^2 W_1^2(\mu_V, \nu), \; h(\mu_V) W_1(\mu_V, \nu) \big\}. \qquad (15)$$

Note that unlike the quantities $H$ and $\mathcal{W}_{c_V}$, the cost
$\mathcal{W}_{\mathcal{N}(h(\mu_V)|y-x|)}$ is very much dependent on the scalar product, which
should therefore be chosen with care.

Let us explain how the consequences of (13) stated in the Theorem are obtained. The
first one (14) follows from a general and straightforward principle: given two costs $c_1, c_2$ we
always have $\mathcal{W}_{c_1 + c_2}(\cdot, \cdot) \ge \mathcal{W}_{c_1}(\cdot, \cdot) + \mathcal{W}_{c_2}(\cdot, \cdot)$.
The "in particular" may seem more dubious. The reason that (15) follows indeed from (14) is that,
up to numerical constants (see below), we can replace the function $\mathcal{N}(s) = \min(s^2, s)$
by a convex increasing function $\mathcal{F}(s)$, and then we can invoke Jensen's inequality to
ensure that $\mathcal{W}_{\mathcal{F}(|y-x|)}(\nu, \mu) \ge \mathcal{F}(\mathcal{W}_{|y-x|}(\mu, \nu))$.
Note however that the form (15) is strictly weaker than the forms (13) and (14). In
particular, we should note that the cost $\mathcal{N}(h(\mu_V)|y - x|)$ behaves like
$h(\mu_V)^2 |y - x|^2$ when $x$ and $y$ are close to each other, and this behavior is well adapted
to linearization procedures.

Let us describe some consequences of Theorem 3 in the case where $V$ is convex. Applied
to Gaussian type measures, when $c_V(x,y) \ge \lambda |x - y|^2/2$, it amounts to a quantitative
version of the transport inequality (9).

Proposition 4 (Gaussian type transport with a remainder). Let $V : \mathbb{R}^n \to \mathbb{R}$ be
a $C^2$ convex function with $D^2 V \ge \lambda \, \mathrm{Id}$ on $\mathbb{R}^n$ for some $\lambda > 0$
(we have mainly in mind the Gaussian measure, for which $\lambda = 1$). Then, for every probability
measure $\nu$ on $\mathbb{R}^n$ such that $\int x \, d\nu = \int x \, d\mu_V$ we have

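$$H(\nu\|\mu_V) - \frac{\lambda}{2} W_2^2(\mu_V, \nu) \ge c \, \mathcal{W}_{\mathcal{N}(h(\mu_V)|y-x|)}(\mu_V, \nu) \qquad (16)$$

$$\ge \tilde{c} \min\big\{ h(\mu_V)^2 W_1^2(\mu_V, \nu), \; h(\mu_V) W_1(\mu_V, \nu) \big\} \qquad (17)$$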

for some numerical constants $c, \tilde{c} > 0$. One can also replace $h(\mu_V)^2$ by $\lambda$
since $h(\mu_V)^2 \ge c' \lambda$ for some numerical constant $c' > 0$.

Next, linearization of the inequality (13) in Theorem 3 leads to a reinforced Brascamp-Lieb
inequality in the case of centered functions.

Proposition 5. Let $V : \mathbb{R}^n \to \mathbb{R}$ be a convex function with
$\int e^{-V} < +\infty$. For every locally Lipschitz function $g \in L^2(\mu_V)$ with

$$\int x \Big( g(x) - \int g \, d\mu_V \Big) \, d\mu_V(x) = 0$$

we have

$$\mathrm{Var}_{\mu_V}(g) \le \int \big[ D^2 V + c \, h(\mu_V)^2 \, \mathrm{Id} \big]^{-1} \nabla g \cdot \nabla g \, d\mu_V,$$

for some numerical constant $c > 0$. We can replace $h(\mu_V)^2$ by $\lambda(\mu_V)$, in view of (11).

One can derive a similar result using Hörmander's $L^2$-method, but with a different
centering condition of the form $\int \nabla g \, d\mu_V = 0$ (see [2]).

We should add, as apparent from the proof, that the convexity of $V$ is not really needed
in Proposition 5. The correct assumption is that $D^2 V + c \, h(\mu_V)^2 \, \mathrm{Id}$ is
nonnegative. In particular, the result applies to perturbed log-concave measures, provided
$h(\mu_V) > 0$.

Equality cases in the Brascamp-Lieb inequality (7) are given, exactly, by the functions
$g$ of the form

$$g(x) = \nabla V(x) \cdot v_0 + c_0 \qquad (18)$$

with $v_0 \in \mathbb{R}^n$ and $c_0 \in \mathbb{R}$. In order to have a nice quantitative version,
one would like to get rid of the centering assumption and to measure, in some form, a "distance"

$$\inf_{v_0, c_0} d(g, \nabla V \cdot v_0 + c_0)$$

to the set of extremizers (18). Here is an attempt.

Proposition 6 (Brascamp-Lieb inequality with a remainder term). Let $V : \mathbb{R}^n \to \mathbb{R}$
be a convex function with $\int e^{-V} < +\infty$. Then for every locally Lipschitz function
$g \in L^2(\mu_V)$, if we denote

$$g_0(x) := g(x) - \nabla V(x) \cdot v_0 - c_0, \qquad c_0 := \int g \, d\mu_V \quad \text{and} \quad v_0 := \int y \, (g(y) - c_0) \, d\mu_V(y),$$

we have

$$\int (D^2 V(x))^{-1} \nabla g(x) \cdot \nabla g(x) \, d\mu_V(x) - \mathrm{Var}_{\mu_V}(g)
\ge c \, \lambda(\mu_V) \int (D^2 V)^{-1} \big( D^2 V + c \lambda(\mu_V) \, \mathrm{Id} \big)^{-1} \nabla g_0 \cdot \nabla g_0 \, d\mu_V$$

where $c > 0$ is a numerical constant. As a consequence, if we denote by $\lambda_{\max}(x)$ the
largest eigenvalue of the nonnegative operator $D^2 V(x)$, we have

/(DV(i))-‘Vj(i) ■ - Var„,(j) >- . , ■ [ hP d/iy 

J sup^ Amax(a:) + cA(/iy) J 

and 

[{D^V{x)y^Vg{x)-Vg{x) dfivix)-y&T^^{g) > y-- ^—( / \go\d^iv) 

J J Amax(Amax + cA(/iy)) d/iy V J / 

where $c > 0$ is a numerical constant.

Let us recall that $\lambda(\mu_V)$ can be estimated by
$\lambda(\mu_V) \ge c \, \big( \int \lambda_{\min}^{-1} \, d\mu_V \big)^{-1}$, where
$\lambda_{\min}(x)$ denotes the lowest eigenvalue of the nonnegative operator $D^2 V(x)$ and
$c > 0$ is a numerical constant (see [311E7]). Therefore, the constant in the previous Proposition
(which is an increasing function of $\lambda(\mu_V)$) can be lower bounded by some integrals of
$\lambda_{\min}$ and $\lambda_{\max}$ with respect to $\mu_V$. For instance, using the previous
bound and the fact $\big( \int \lambda_{\min}^{-1} \, d\mu_V \big)^{-1} \le \int \lambda_{\max} \, d\mu_V$
we find

_ cMmv)^ _^^_ 

J" '^max('^max “ 1 “ C A (py)) dpy (J dg,v) ^ f A^^x 

for some numerical constant $c > 0$. This might provide a computable constant beyond the
easy case where $\lambda \, \mathrm{Id} \le D^2 V \le R \, \mathrm{Id}$ on $\mathbb{R}^n$.

We conclude this introduction with some bibliographical comments. Part of the present
note is rather elementary, and many arguments are known to specialists in mass transport,
some having appeared implicitly or explicitly in recent or older works. For instance, we
already said that Proposition 1 was folklore in the theory, and while writing these notes
we heard about the work of Bolley, Gentil and Guillin [7] which contains an analogue, in
a less straightforward form, of the inequality of Proposition 1 together with its connection
to the Brascamp-Lieb inequality. If we go back in time, the idea of using the remainder
term in the transportation proof of [10] appears, in the case of dimension one, in the paper
by Barthe and Kolesnikov [1]. Similar arguments in higher dimensions for unconditional
measures were recently used in [20] and in a form very close to the one used here in [12].
Mass transport arguments combined with Poincaré inequalities (of a different nature than the
one we use) were put forward to exhibit remainder terms in isoperimetric type inequalities
in the far-reaching work of Figalli, Maggi and Pratelli, in particular in [15] for the case of
log-concave measures (or rather convex sets). Our treatment is in part very close to the
recent work of Fathi, Indrei and Ledoux [14] where the mass transport remainder term is
combined with a Poincaré inequality in order to get a bound on the deficit for Talagrand's
inequality (9) in the case of the Gaussian measure (they have also similar, but deeper,
arguments for the log-Sobolev inequality, a case that was also considered in [6]).

The quantitative transport inequality obtained by Fathi, Indrei and Ledoux [14] for the
standard Gaussian measure $\mu = \gamma$ on $\mathbb{R}^n$ (a case where $\lambda(\mu) = 1$) is as
follows: for any probability measure $\nu$ with $\int x \, d\nu(x) = \int x \, d\gamma(x) = 0$,

$$H(\nu\|\gamma) - \frac{1}{2} W_2^2(\gamma, \nu) \ge c \min\Big( \frac{W_{1,1}(\nu, \gamma)^2}{n}, \; \frac{W_{1,1}(\nu, \gamma)}{\sqrt{n}} \Big)$$

where $W_{1,1} := \mathcal{W}_{\|x - y\|_1}$ with $\|x - y\|_1 := \sum_{i=1}^n |x_i - y_i|$. If we
compare with inequality (17) in Proposition 4 above applied to the Gaussian measure $\mu = \gamma$,
we see that our result is formally stronger, since

$$W_1(\mu, \nu) \ge \frac{1}{\sqrt{n}} W_{1,1}(\mu, \nu).$$

Actually, our bound is significantly better in many cases, but both bounds are "equally
bad" when $\nu$ is a product of centered measures being at a "large" distance from the
one-dimensional Gaussian, since in this case one expects a remainder of order $n$ and both results
give something of order $\sqrt{n}$ (on the other hand, it is not clear to us that this situation is
the most relevant one).

Regarding quantitative versions of the variance Brascamp-Lieb inequality, Hargé [19] (by
an $L^2$-method) and recently Bolley, Gentil and Guillin [7] (by linearization of a transport
inequality) obtained a remainder term which is, up to constants depending on $\mu_V$, of the
form


$$R_V(g) := \Big( \int g V \, d\mu_V - \int g \, d\mu_V \int V \, d\mu_V \Big)^2.$$


Note that, unlike the remainder term in Proposition 6, this term $R_V(g)$ does not vanish
only for extremizers. For instance $R_V(g)$ is zero if $V$ is even and $g$ odd. Actually, the
space where $R_V$ vanishes is of co-dimension one in $L^2(\mu_V)$ whereas extremizers (18) form
an $(n+1)$-dimensional subspace. Of course, it could be that such a type of remainder is
nonetheless sometimes better and more useful than the one we obtained. Bolley, Gentil
and Guillin also derive, in the same work [7] but by a different method (namely by
linearization of a functional Brunn-Minkowski inequality), a second quantitative form of the
variance Brascamp-Lieb inequality with a remainder term that vanishes exactly for the
extremizers (18), as expected. This remainder however is not an $L^1$ or $L^2$ distance to the
space of extremizers, and so the comparison with the result of our Proposition 6 is not
clear to us.


The plan of the paper is as follows. In the next section we prove Proposition 1 and
Theorem 3. For this we recall some tools from the Brenier-McCann monotone mass transport
theory, and prove a general lower bound for the remainder term (Lemma 8) that might be
of independent interest. Then, we prove Proposition 5 and Proposition 6.





We would like to thank Bernard Maurey for useful observations on our manuscript, and
the anonymous referee for several insightful questions and observations that led to improved
statements. We would also like to express our deep gratitude to Emanuel Milman for his
very sharp reading of the first version of our preprint. He pointed out to us a fatal mistake
in the use we made of his results; this led us to rewrite the main statements and their
proofs.

2 Mass transport, minoration of the remainder and
proofs of Proposition 1 and Theorem 3

The proofs of Proposition 1 and Theorem 3 use monotone transportation of measure in the
spirit of [10].

Given two probability measures $\mu$ and $\nu$ on $\mathbb{R}^n$ with densities $F$ and $G$,
respectively, we know from Brenier [9] and McCann [23] that there exists a convex function $\psi$
such that the map $\nabla\psi$ pushes forward $\mu$ onto $\nu$. By the simple but useful
weak-regularity theory of McCann [21| we have, for $\mu$-almost any $x$,

$$F(x) = G(\nabla\psi(x)) \det D^2\psi(x). \qquad (19)$$

Here $D^2\psi(x)$ stands for the Hessian of the convex function $\psi$ in the sense of Aleksandrov,
which exists almost everywhere. There are several ways to use this equation to prove our
inequalities. One can use the McCann weak theory of change of variables [23], as in [10].
The advantage is that it relies on simple arguments in convexity and Lebesgue measure
theory. Alternatively, one can use results on the regularity of the Monge-Ampère equation,
in the spirit of those obtained by Caffarelli. This relies on more difficult and deeper
arguments. However, partial regularity results for solutions of Monge-Ampère have been
simplified and extended recently, and we shall favor this point of view.
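
In dimension one the Brenier map is simply the monotone rearrangement, so equation (19) can be checked by hand. The sketch below takes $\mu = N(0,1)$ and, as an illustrative assumption, $\nu = N(1, 2^2)$; it builds $T$ from the two distribution functions using the standard-library class `statistics.NormalDist` and evaluates the residual of $F(x) = G(T(x))\,T'(x)$. The function names are hypothetical.

```python
from statistics import NormalDist

mu = NormalDist(0.0, 1.0)   # source measure, density F
nu = NormalDist(1.0, 2.0)   # target measure, density G (illustrative choice)

def brenier_map_1d(x):
    """Monotone rearrangement T = (cdf of nu)^{-1} o (cdf of mu)."""
    return nu.inv_cdf(mu.cdf(x))

def monge_ampere_residual(x, step=1e-4):
    """F(x) - G(T(x)) T'(x), with T' approximated by a centered difference."""
    T_prime = (brenier_map_1d(x + step) - brenier_map_1d(x - step)) / (2.0 * step)
    return mu.pdf(x) - nu.pdf(brenier_map_1d(x)) * T_prime

points = [-2.0, -0.5, 0.0, 0.7, 2.0]
residuals = [monge_ampere_residual(x) for x in points]
```

Between two Gaussians the map is affine, here $T(x) = 1 + 2x$, so the displacement $\theta' = T(x) - x$ has constant second derivative.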

Let us assume that $\mu$ and $\nu$ are supported on the whole $\mathbb{R}^n$, and that the
densities are continuous and strictly positive (so locally bounded above and away from zero).
Since the support of the target measure is convex (here $\mathbb{R}^n$), one can prove that the
convex function $\psi$ solves the Monge-Ampère equation also in the sense of Aleksandrov (see
the argument given in the proof of [13, Theorem 3.3]), and by the assumption above on the
densities, the local regularity of [25], say, applies. In particular, $\psi$ is
$W^{2,1}_{\mathrm{loc}}(\mathbb{R}^n)$.

To prove the transport inequality of Proposition 1 for
$d\mu_V(x) = \frac{e^{-V(x)}}{\int e^{-V}} \, dx$, we assume that $d\nu = f(x) \, d\mu_V(x)$. It is
sufficient to prove the inequalities in Proposition 1 and Theorem 3 in the case where $f$ is
continuous and strictly positive on $\mathbb{R}^n$, so that the previous assumptions are
satisfied. We can also assume that $\nu$ has second moment.

We introduce the Brenier map $T = \nabla\psi$ between $\mu_V$ and $\nu$. We have that
$\psi \in W^{2,1}_{\mathrm{loc}}(\mathbb{R}^n)$ and that almost everywhere

$$e^{-V(x)} = f(T(x)) \, e^{-V(T(x))} \det D^2\psi(x).$$

It is convenient to introduce the displacement $\nabla\theta(x) = T(x) - x = \nabla\psi(x) - x$
(i.e. $\theta(x) := \psi(x) - |x|^2/2$). If we take the log in the previous equation and introduce
$c_V(x, T(x)) = V(T(x)) - V(x) - \nabla V(x) \cdot \nabla\theta(x)$, we find

$$\log(f(T(x))) - c_V(x, T(x)) = \nabla V(x) \cdot \nabla\theta(x) - \log\det D^2\psi(x)$$

$$= \nabla V \cdot \nabla\theta - \Delta\theta + \Delta\theta - \log\det(\mathrm{Id} + D^2\theta).$$

We integrate with respect to $\mu_V$. Noticing that
$\int \log(f \circ T) \, d\mu_V = \int f \log(f) \, d\mu_V$, we have

$$H(\nu\|\mu_V) - \int c_V(x, T(x)) \, d\mu_V = \int \big[ \nabla V \cdot \nabla\theta - \Delta\theta \big] \, d\mu_V + \int \big[ \Delta\theta - \log\det(\mathrm{Id} + D^2\theta) \big] \, d\mu_V.$$

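The first term in the right-hand side vanishes after integration by parts, thanks to the
integrability assumptions we have made. Let us justify this.

Fact 7.

$$\int \Delta\theta \, e^{-V} = \int \nabla\theta \cdot \nabla V \, e^{-V}.$$

Proof. Since $\int |T|^2 e^{-V} = \int |y|^2 \, d\nu(y)$ and $\nu$ has second moment, we have in
view of our assumptions that $\int |\nabla\theta|^2 e^{-V} < +\infty$ and
$\int |\nabla\theta| \, e^{-V} < +\infty$. Let $h$ be a smooth function on $\mathbb{R}^n$,
with values in $[0,1]$, that is compactly supported and is identically one in a neighborhood
of $0 \in \mathbb{R}^n$. Introduce the sequence $h_k(x) = h(x/k)$. We have $0 \le h_k \le 1$,
$h_k(x) \uparrow 1$ for every $x \in \mathbb{R}^n$ and $\|\nabla h_k\|_\infty \to 0$ as
$k \to +\infty$. We have

$$\int h_k \, \Delta\theta \, e^{-V} = - \int \nabla h_k \cdot \nabla\theta \, e^{-V} + \int h_k \, \nabla\theta \cdot \nabla V \, e^{-V}.$$

For the left-hand side, we can write

$$\int h_k \, \Delta\theta \, e^{-V} = \int h_k \, \Delta\psi \, e^{-V} - n \int h_k \, e^{-V}$$

and each term converges using the monotone convergence theorem (since $\Delta\psi \ge 0$), giving
$\int \Delta\psi \, e^{-V} - n \int e^{-V} = \int (\Delta\psi - n) \, e^{-V} = \int \Delta\theta \, e^{-V}$.
The first term in the right-hand side tends to zero, since it is bounded by
$\|\nabla h_k\|_\infty \int |\nabla\theta| \, e^{-V}$. For the last term, we conclude
by using the dominated convergence theorem, since

$$\int |\nabla\theta \cdot \nabla V| \, e^{-V} \le \frac{1}{2} \int |\nabla\theta|^2 e^{-V} + \frac{1}{2} \int |\nabla V|^2 e^{-V} < +\infty.$$

□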


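So we have arrived at the following elementary formula:

$$H(\nu\|\mu_V) = \int c_V(x, T(x)) \, d\mu_V(x) + \int \big[ \Delta\theta - \log\det(\mathrm{Id} + D^2\theta) \big] \, d\mu_V$$

$$= \int c_V(x, T(x)) \, d\mu_V(x) + \int \big[ \mathrm{tr} \, D^2\theta - \mathrm{tr}(\log(\mathrm{Id} + D^2\theta)) \big] \, d\mu_V$$

$$= \int c_V(x, T(x)) \, d\mu_V(x) + \int \mathrm{tr}(\mathcal{F}(D^2\theta)) \, d\mu_V, \qquad (20)$$

where $\mathcal{F} : \, ]-1, +\infty[ \, \to [0, +\infty[$ stands for the convex (increasing on
$\mathbb{R}^+$) function defined by

$$\mathcal{F}(t) := t - \log(1 + t), \qquad t \in \, ]-1, +\infty[. \qquad (21)$$

Since by definition $\int c_V(x, T(x)) \, d\mu_V \ge \mathcal{W}_{c_V}(\mu_V, \nu)$ and
$\mathcal{F} \ge 0$, we have proved in particular the inequality in Proposition 1.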
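
The remainder integrand in (20) is a spectral function of the symmetric matrix $D^2\theta$. The following sketch (with a random symmetric matrix standing in for $D^2\theta(x)$, an assumption made only for illustration) checks numerically that $\mathrm{tr}\,\mathcal{F}(A)$ computed from the eigenvalues agrees with $\mathrm{tr}\,A - \log\det(\mathrm{Id} + A)$.

```python
import numpy as np

def F(t):
    """F(t) = t - log(1 + t), defined for t > -1."""
    return t - np.log1p(t)

def trace_F(A):
    """tr F(A) for a symmetric matrix A with eigenvalues > -1, via its spectrum."""
    eigenvalues = np.linalg.eigvalsh(A)
    return float(np.sum(F(eigenvalues)))

rng = np.random.default_rng(1)
B = rng.normal(size=(4, 4))
S = B + B.T
A = 0.5 * S / np.linalg.norm(S, 2)        # symmetric, eigenvalues in [-1/2, 1/2]

sign, logdet = np.linalg.slogdet(np.eye(4) + A)
direct = float(np.trace(A) - logdet)      # tr A - log det(Id + A)
spectral = trace_F(A)                     # sum of F over the eigenvalues
```

Both quantities coincide and are nonnegative, which is the pointwise fact behind the inequality of Proposition 1.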

The treatment of the cases of equality in Proposition 1 requires a bit of extra work (in
particular since $T$ was not a priori the $c_V$-optimal map); we postpone it to the end of the
present section and go on with the proof of Theorem 3.

In order to prove Theorem 3, we have to play a bit with the second term in the right-hand
side of (20), as is done in the works we mentioned in the introduction. Indeed, a

“remainder” term of this form appears in several mass transport proofs (for instance [T5l 

2 

[HEQiiniiii]), sometimes in equivalent forms such as + Tfjy] “ 1 ) X] TTfry 

$s_i$ refer to the eigenvalues of $D^2\theta$). Anyway, the crucial property of these functions
and of the convex function $\mathcal{F}(t) = t - \log(1 + t)$ is that it behaves like $t^2$ for $t$
close to zero, and like $t$ for $t$ large. More precisely, we have for every
$t \in \, ]-1, +\infty[$ that $\mathcal{F}(t) \ge \mathcal{F}(|t|)$ and that for every $s \ge 0$

$$\frac{1}{4} \min(s^2, s) \le \mathcal{F}(s) \le \min(s^2, s). \qquad (22)$$

But we find it more convenient to work with the convex function $\mathcal{F}(|t|)$ rather than
with $\mathcal{N}(|t|) = \min(t^2, |t|)$.
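
The two-sided comparison between $\mathcal{F}(s) = s - \log(1+s)$ and $\min(s^2, s)$ can be explored numerically. The sketch below evaluates the ratio on a logarithmic grid; the lower constant $1/4$ that it is compared against is the value assumed in this example, and the grid bounds are arbitrary.

```python
import numpy as np

def F(s):
    """F(s) = s - log(1 + s)."""
    return s - np.log1p(s)

def N(s):
    """N(s) = min(s^2, s) for s >= 0."""
    return np.minimum(s**2, s)

s = np.logspace(-3, 3, 2001)    # sample points in (0, +infinity)
ratio = F(s) / N(s)             # should stay between the two constants
lowest, highest = float(ratio.min()), float(ratio.max())
```

On this grid the ratio stays between $1 - \log 2 \approx 0.307$ (attained at $s = 1$) and $1$, with limits $1/2$ as $s \to 0$ and $1$ as $s \to +\infty$.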

The treatment of the remainder term is stated in the next, central, Lemma, which is
of independent interest.

Lemma 8. Let $\mu$ be a probability measure on $\mathbb{R}^n$ absolutely continuous with respect
to the Lebesgue measure and $\theta \in W^{2,1}_{\mathrm{loc}}(\mathbb{R}^n)$ with
$D^2\theta \ge -\mathrm{Id}$ almost everywhere. We assume that $|\nabla\theta| \in L^1(\mu)$ with
$\int \nabla\theta \, d\mu = 0$. Then,

$$\int \mathrm{tr}(\mathcal{F}(D^2\theta)) \, d\mu \ge c \int \mathcal{F}(h(\mu)|\nabla\theta|) \, d\mu \qquad (23)$$

for some numerical constant $c > 0$.

Note that our assumption $\int x \, d\mu_V = \int x \, d\nu$ rewrites as
$\int \nabla\theta \, d\mu_V = 0$, so if we use in (20) the previous Lemma with $\mu = \mu_V$ and
$\theta$ our displacement function (and the comparison (22) between $\mathcal{F}$ and
$\mathcal{N}$), we find

$$H(\nu\|\mu_V) \ge \int \tilde{c}_V(x, T(x)) \, d\mu_V \ge \mathcal{W}_{\tilde{c}_V}(\mu_V, \nu)$$

as claimed in Theorem 3.

So it only remains to prove the previous Lemma. Denote by $\sigma$ the uniform probability
measure on $S^{n-1}$. Recall that for every vector $X \in \mathbb{R}^n$ we have

$$n \int_{S^{n-1}} (X \cdot u)^2 \, d\sigma(u) = |X|^2$$

and that

$$c|X| \le \sqrt{n} \int_{S^{n-1}} |X \cdot u| \, d\sigma(u) \le |X|,$$

for some numerical constant $c > 0$.
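
The numerical constant in the last display can be made explicit. By rotation invariance the ratio $\sqrt{n} \int |X \cdot u| \, d\sigma(u) / |X|$ equals $\sqrt{n}\,\mathbb{E}|u_1|$, and the sketch below evaluates it with the classical Gamma-function formula for $\mathbb{E}|u_1|$ (a standard fact that is not stated in the text; the function names are hypothetical).

```python
import math

def mean_abs_coordinate(n):
    """E|u_1| for u uniform on S^{n-1}: Gamma(n/2) / (sqrt(pi) * Gamma((n+1)/2))."""
    log_ratio = math.lgamma(n / 2.0) - math.lgamma((n + 1) / 2.0)
    return math.exp(log_ratio) / math.sqrt(math.pi)

def sphere_constant(n):
    """sqrt(n) * int |e . u| d sigma(u) for a unit vector e in R^n."""
    return math.sqrt(n) * mean_abs_coordinate(n)

constants = {n: sphere_constant(n) for n in (1, 2, 3, 10, 100, 10000)}
```

The values decrease from $1$ (for $n = 1$) towards $\sqrt{2/\pi} \approx 0.798$ as $n$ grows, so under this formula one may take $c = \sqrt{2/\pi}$ in the lower bound.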

We will use the following Fact, the proof of which is postponed below.

Fact 9. Let $A$ be a symmetric matrix with eigenvalues $> -1$. Then

$$\mathrm{tr}(\mathcal{F}(A)) \ge \frac{1}{8} \int_{S^{n-1}} \mathcal{F}(\sqrt{n} \, |Au|) \, d\sigma(u). \qquad (24)$$


We will combine this with the following isoperimetric type inequality. It is due to 
Bobkov and Houdre |1], where it is stated with the median. We will include a proof below 
for completeness. 


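Fact 10 (Bobkov-Houdré). Let $\mu$ be a probability measure on $\mathbb{R}^n$. For every regular
enough $f : \mathbb{R}^n \to \mathbb{R}$ we have

$$\int \mathcal{F}\Big( \Big| f - \int f \, d\mu \Big| \Big) \, d\mu \le c \int \mathcal{F}\Big( \frac{|\nabla f|}{h(\mu)} \Big) \, d\mu \qquad (25)$$

for some numerical constant $c > 0$ (for instance $c = 3 \times 4^3$ works).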

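With these two facts in hand, we can now finish the proof of (23). We have, by Fact 9
and Fubini's theorem, that

$$\int \mathrm{tr}(\mathcal{F}(D^2\theta)) \, d\mu \ge \frac{1}{8} \int_{S^{n-1}} \int \mathcal{F}(\sqrt{n} \, |D^2\theta \, u|) \, d\mu \, d\sigma(u). \qquad (26)$$

For any fixed vector $u \in S^{n-1}$ we have that the function
$g(x) = h(\mu) \sqrt{n} \, \nabla\theta(x) \cdot u$ is $W^{1,1}_{\mathrm{loc}}(\mathbb{R}^n)$ with
derivative $\nabla g(x) = h(\mu) \sqrt{n} \, (D^2\theta(x)) u$, and $\int g \, d\mu = 0$. So we
have by applying Fact 10 that

$$\int \mathcal{F}(\sqrt{n} \, |(D^2\theta) u|) \, d\mu \ge \frac{1}{3 \times 4^3} \int \mathcal{F}(h(\mu) \sqrt{n} \, |\nabla\theta \cdot u|) \, d\mu.$$

Back to (26), integrating the previous inequality with respect to $d\sigma(u)$, using that
$\mathcal{F}$ is convex (Jensen's inequality) and the lower bound
$c|X| \le \sqrt{n} \int |X \cdot u| \, d\sigma(u)$ recalled above, we find

$$\int \mathrm{tr}(\mathcal{F}(D^2\theta)) \, d\mu \ge \frac{1}{3 \times 4^3 \times 8} \int \mathcal{F}\Big( h(\mu) \sqrt{n} \int_{S^{n-1}} |\nabla\theta \cdot u| \, d\sigma(u) \Big) \, d\mu \ge c \int \mathcal{F}(h(\mu)|\nabla\theta|) \, d\mu.$$

This ends the proof of Lemma 8, modulo the two facts above that we prove now.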

Proof of Fact 9. Let us first collect some straightforward properties of $\mathcal{F}$, or
equivalently, in view of (22), of $\min(s^2, |s|)$. These functions commute with power functions.
In particular, we shall use that

$$\forall s \ge 0, \qquad \mathcal{F}(\sqrt{s}) \le \sqrt{\mathcal{F}(s)} \le 2 \mathcal{F}(\sqrt{s}). \qquad (27)$$

Note that $\mathcal{F}(2s) \le 4 \mathcal{F}(s)$. Observe also that for a finite family
$s_1, \ldots, s_k \ge 0$ we have





Indeed, if we denote by Smax the largest number, we see using (122|1 that 




4 ^ min(sj, sj) > - min(s„„, si„) E 

i<k i<k 


j: 


( 28 ) 


Then, distinguishing between Smax < 1, and Smax > 1, a case for which we replace it by 
using Smax < ^Jj2i<k ^6 hud 

E^W > jmm(E4 ,/^) S 

i<k i<k y i<k V i<k 

Let us mention that inequality fl28|l will be the one responsible for the loss of a factor y/n 
in the case of product measures. 

Back to the proof of the Fact, let us notice that J^{A) > since -F(sj) > J^(|sj|) 

for any eigenvalue Si of A. Denote 

H := |kl| = vCdM; 

it is a nonnegative symmetric matrix. Let (ci,..., e„) be an orthonormal basis of eigen¬ 
vectors of A. Then 

a{HH)) = E ^ E 

i<n i<n i<n 


where we used m- Let us mention in passing that using the convexity of fF, we can 
establish more generally that for for any v G we have \J^{H)v\ > 

From this, the fact that \/^ is concave on M’*', then again (HT^i and hnally (|28|) we hnd 


i<n i<n ^ 

> - / \f^{n{Hei ■ uY) ds{u) > - f 'S^fF{y/n\Hei-u\)ds{u) 

2d lon — l 2 I Qn — 1 

i<n i<n 

-f/ ds(M) = ^ /” T{y/n\Hu\) ds{u) 

8 Jsn-1 ^ y .<„ ^ » Js-1 


To conclude, use that = H^u ■ u = A^u ■ u = \A 


u\ 


□
Proof of Fact 10. By scaling the metric, we can assume that $h(\mu) = 1$. More precisely, we
can change the scalar product $x \cdot y$ into $h(\mu)^{2} \, x \cdot y$, which changes the
gradient accordingly in (10) and (25).

Denote by $m$ a $\mu$-median of $f$. By a standard argument, it is enough to prove that

$$\int \mathcal{F}(|f - m|) \, d\mu \le 3 \times 4^2 \int \mathcal{F}(|\nabla f|) \, d\mu.$$

Indeed, since $\mathcal{F}(2t) \le 4 \mathcal{F}(t)$ and since $\mathcal{F}$ is convex increasing on
$\mathbb{R}^+$, we have for any function $g$ with $\mu$-median $m_g$

$$\int \mathcal{F}\Big( \Big| g - \int g \, d\mu \Big| \Big) \, d\mu \le 2 \int \mathcal{F}(|g - m_g|) \, d\mu + 2 \mathcal{F}\Big( \Big| \int (g - m_g) \, d\mu \Big| \Big) \le 4 \int \mathcal{F}(|g - m_g|) \, d\mu.$$


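The same kind of argument shows that one can use a median $m_g$ instead of the mean, in
the definition (10). Indeed, for any $g \in L^1(\mu)$ with median $m_g$ we have

$$\int |g - m_g| \, d\mu \le \int \Big| g - \int g \, d\mu \Big| \, d\mu + \Big| \int g \, d\mu - m_g \Big|.$$

We can assume that $m_g \ge \int g \, d\mu$ (otherwise use $-g$), and by the definition of $m_g$
and by Markov's inequality we have

$$\frac{1}{2} \le \mu(\{ g \ge m_g \}) \le \mu\Big( \Big\{ \Big| g - \int g \, d\mu \Big| \ge m_g - \int g \, d\mu \Big\} \Big) \le \frac{1}{m_g - \int g \, d\mu} \int \Big| g - \int g \, d\mu \Big| \, d\mu,$$

and so $|m_g - \int g \, d\mu| \le 2 \int |g - \int g \, d\mu| \, d\mu$. Therefore, we have

$$\int |g - m_g| \, d\mu \le 3 \int \Big| g - \int g \, d\mu \Big| \, d\mu \le 3 \int |\nabla g| \, d\mu.$$

Given our $f$ with $\mu$-median $m$, let us introduce the (continuous) function $g$ such that

$$g(x) = \begin{cases} \mathcal{F}(|f(x) - m|) & \text{if } f(x) \ge m, \\ -\mathcal{F}(|f(x) - m|) & \text{if } f(x) < m. \end{cases}$$

Since $\mathcal{F} \ge 0$, the function $g$ has zero $\mu$-median. Therefore

$$\int \mathcal{F}(|f - m|) \, d\mu = \int |g| \, d\mu \le 3 \int |\nabla g| \, d\mu. \qquad (29)$$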

We will now use an argument inspired by [20]. Let us observe that for every $s \in \mathbb{R}^+$,
$t \in [0,1]$ (this is the only good choice to estimate the Legendre transform of $\mathcal{F}$), we have

st < 4:fF{s) H- 
Indeed, for $s \ge 1$ the inequality is obvious since $4\mathcal{F}(s) \ge s \ge st$, and for
$s \le 1$ use that $4\mathcal{F}(s) \ge s^2$ to complete the square. Since
$\mathcal{F}' \in [0,1]$ on $\mathbb{R}^+$, we have

j\Vg\di, = j T'{\f-m\)\Vf\dg<4, j T(\Vf\)dg + ^j T'{\f-m\fdg 

<4 j jr(|v/|)4,i+i jT{\f-m\)df,, 

where the second inequality follows from < 4J^(s) for every s G M"*" (this can can be 

seen, for instance, by computing (4J^ — ^ > 0). Plugging this in fl2^ 

we hnd 

^ j ^(1/ - m|) d/r < 3 X 4 y ^(|V/|) d/r, 

which gives the desired inequality. □ 

This completes the proof of Lemma 8 and Theorem 3. It only remains to treat the
cases of equality in Proposition 1.

Determination of equality cases in Proposition 1. The idea is that, if equality holds in
Proposition 1 and if $\nu$ and $\mu_V$ have the same barycenter, a situation that can be imposed by
translating $\nu$ (provided we know that translation preserves equality cases), then we can
apply Theorem 3 and conclude that $W_1(\mu_V, \nu) = 0$, which implies $\nu = \mu_V$. Oddly enough,
the converse also requires some work; even the fact that there is equality when $\nu = \mu_V$ is
not straightforward, and actually requires the convexity of $V$.

We will prove the stronger result of Remark 2. Given a vector $v \in \mathbb{R}^n$, let us denote
by $T_v \nu$ the translation by $v$ of the probability $\nu$; if $d\nu(x) = F(x) \, dx$, then
$dT_v\nu(x) = F(x - v) \, dx$. The following Lemma is essential as it establishes the translation
invariance of the inequality under study.

Lemma 11 (Translation invariance). With the notation of Proposition 1 we have, for any
probability $\nu$ and any vector $v \in \mathbb{R}^n$, that

$$H(T_v\nu\|\mu_V) - \mathcal{W}_{c_V}(\mu_V, T_v\nu) = H(\nu\|\mu_V) - \mathcal{W}_{c_V}(\mu_V, \nu).$$

Proof. To simplify the notation, we can assume that $\int e^{-V} = 1$, and also that $d\nu(x) =
f(x)\,d\mu_V(x) = f(x)e^{-V(x)}\,dx$. To treat the transportation term, we will need the following
observation:

Fact 12. Given $v \in \mathbb{R}^n$, introduce $\tilde v = (0, v) \in \mathbb{R}^{2n}$. Let $\mu$ and $\nu$ be two probability
measures on $\mathbb{R}^n$ and $c_V$ be the cost from Proposition 1. If $\pi$ is a $c_V$-optimal coupling for
$(\mu, \nu)$ then $T_{\tilde v}\pi$ is a $c_V$-optimal coupling for $(\mu, T_v\nu)$.

Let us prove this fact. The coupling condition is clear, so we only need to check
that the translated coupling $T_{\tilde v}\pi$, with $\tilde v = (0,v)$, is $c_V$-optimal when $\pi$ is. Equivalently, by the characterization of optimality in
terms of cyclical monotonicity (see [22, Chapter 5]), it suffices to check that the support
of $T_{\tilde v}\pi$ is $c_V$-cyclically monotone when the support of $\pi$ is $c_V$-cyclically monotone. Let
$(x_1,y_1), \dots, (x_k,y_k)$ be arbitrary points of $\mathbb{R}^n \times \mathbb{R}^n$ with the convention that $(x_{k+1},y_{k+1}) :=
(x_1,y_1)$. We have

$$\sum_{i=1}^k c_V(x_i, y_{i+1}+v) - \sum_{i=1}^k c_V(x_i, y_i+v) = -\sum_{i=1}^k \nabla V(x_i)\cdot(y_{i+1}-y_i) = \sum_{i=1}^k c_V(x_i, y_{i+1}) - \sum_{i=1}^k c_V(x_i, y_i),$$

which shows that, indeed, the support of $T_{\tilde v}\pi$ is $c_V$-cyclically monotone if and only if the
support of $\pi$ is $c_V$-cyclically monotone.
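The identity between the cyclic sums rests on a one-line expansion. As a worked check, assume the cost has the form $c_V(x,y) = V(y) - V(x) - \nabla V(x)\cdot(y-x)$; the definition is not restated in this section, so this form is an assumption, chosen because it matches the expansions used here.

```latex
\begin{align*}
c_V(x_i, y_{i+1}+v) - c_V(x_i, y_i+v)
  &= V(y_{i+1}+v) - V(y_i+v) - \nabla V(x_i)\cdot(y_{i+1}-y_i), \\
\sum_{i=1}^{k}\big[V(y_{i+1}+v) - V(y_i+v)\big]
  &= 0 \qquad (\text{telescoping, since } y_{k+1} = y_1), \\
\sum_{i=1}^{k}\big[c_V(x_i, y_{i+1}+v) - c_V(x_i, y_i+v)\big]
  &= -\sum_{i=1}^{k}\nabla V(x_i)\cdot(y_{i+1}-y_i).
\end{align*}
```

The right-hand side does not depend on $v$, so taking $v = 0$ gives the same value for the untranslated sums.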

With this Fact in hand, let us finish the proof of Lemma 11. Let $\pi$ be a $c_V$-optimal
coupling for $(\mu_V, \nu)$. Then by the previous Fact we have that

$$W_{c_V}(\mu_V, T_v\nu) - W_{c_V}(\mu_V, \nu) = \iint \big[c_V(x, y+v) - c_V(x, y)\big]\,d\pi(x,y) = \int \big[V(y+v) - V(y)\big]\,d\nu(y),$$

where we used that $\iint \nabla V(x)\cdot v\,d\pi(x,y) = \int \nabla V(x)\cdot v\,d\mu_V(x) = 0$.

The entropic terms are easier to analyse. Since $dT_v\nu(x) = f(x-v)e^{-V(x-v)}\,dx =
f(x-v)e^{-V(x-v)+V(x)}\,d\mu_V(x)$, we have

$$H(T_v\nu\|\mu_V) - H(\nu\|\mu_V) = \int \big[\log(f(x-v)) - V(x-v) + V(x)\big] f(x-v)e^{-V(x-v)}\,dx - \int f(x)\log(f(x))\,e^{-V(x)}\,dx$$

$$= \int \big[-V(x) + V(x+v)\big]\,d\nu(x).$$

By subtracting the previous two equations, we obtain the conclusion of Lemma 11. □
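As a sanity check of Lemma 11, the standard Gaussian case can be worked out by hand. The computation below assumes the cost has the form $c_V(x,y) = V(y) - V(x) - \nabla V(x)\cdot(y-x)$ (an assumption, since the definition is not restated here), which for $V(x) = |x|^2/2$ plus a normalizing constant gives $c_V(x,y) = |x-y|^2/2$.

```latex
\begin{align*}
\mu_V = N(0,\mathrm{Id}), \quad \nu = \mu_V, \quad T_v\nu = N(v,\mathrm{Id}): \qquad
H(T_v\nu\,\|\,\mu_V) &= \tfrac12 |v|^2, \\
W_{c_V}(\mu_V, T_v\nu) &= \tfrac12\, W_2(\mu_V, T_v\nu)^2 = \tfrac12 |v|^2, \\
H(T_v\nu\,\|\,\mu_V) - W_{c_V}(\mu_V, T_v\nu) &= 0 = H(\nu\,\|\,\mu_V) - W_{c_V}(\mu_V, \nu).
\end{align*}
```

Here $W_2$ denotes the quadratic Wasserstein distance; both sides of the identity vanish for every $v$, as the lemma predicts.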

Next, the role of the convexity of V can be summarized as follows. 

Lemma 13. Let $V : \mathbb{R}^n \to \mathbb{R}$ be a locally Lipschitz function satisfying (1) and $c_V$ be the cost
from Proposition 1, which is well defined for almost every $x \in \mathbb{R}^n$. If there exists an absolutely
continuous probability measure $\mu$ with support $\mathbb{R}^n$ such that $W_{c_V}(\mu, \mu) = 0$, then $V$ is convex
on $\mathbb{R}^n$. Conversely, if $V$ is convex, then $W_{c_V}(\mu,\mu) = 0$ for every absolutely continuous
probability measure $\mu$.

Proof. Since $c_V(x, x) = 0$, if $W_{c_V}(\mu, \mu) = 0$ then the image $\pi$ of $\mu$ by the map
$x \mapsto (x, x)$ is an optimal coupling, and therefore its support is $c_V$-cyclically monotone.
By the assumption on $\mu$, this implies that for (almost) all $x, y$ we have $c_V(x,y) +
c_V(y,x) \ge c_V(x,x) + c_V(y,y)$, which rewrites as

$$(\nabla V(y) - \nabla V(x))\cdot(y-x) \ge 0. \qquad (30)$$

For a locally Lipschitz function (therefore also differentiable almost everywhere), this property implies that $V$ is convex.
Indeed, we can consider $V_\varepsilon = V * \eta_\varepsilon$, where $\eta_\varepsilon(x) = \varepsilon^{-n}\eta(x/\varepsilon)$ is an approximation of the
identity in $\mathbb{R}^n$, with $\eta$ compactly supported. Then the property (30) passes to $V_\varepsilon$, which is
now smooth, so that this property holds at every $(x, y)$, and this implies that $V_\varepsilon$ is convex
on $\mathbb{R}^n$, because the restriction of $V_\varepsilon$ to any affine line has a nondecreasing
derivative. We conclude by using that $V_\varepsilon$ converges to $V$ pointwise as $\varepsilon \to 0$.

Conversely, the fact that $c_V(x,x) = 0$ implies that $W_{c_V}(\mu,\mu) \le 0$ for any absolutely
continuous probability measure $\mu$. Since $c_V \ge 0$ when $V$ is convex, we get in this case that
$W_{c_V}(\mu,\mu) = 0$. (One can also verify that when $V$ is convex, the set $\{(x,x) \,;\, x \in \mathbb{R}^n\}$ is
$c_V$-cyclically monotone.)


□ 
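To see how the monotonicity condition (30) detects a failure of convexity, here is a one-dimensional worked example. The double-well potential $V(x) = x^4 - x^2$ is a hypothetical choice made for illustration only; it is smooth, $e^{-V}$ is integrable, and $V$ is not convex near the origin.

```latex
\begin{align*}
V(x) = x^4 - x^2, \qquad V'(x) &= 4x^3 - 2x, \\
\big(V'(a) - V'(-a)\big)\,\big(a - (-a)\big) &= 2\,(4a^3 - 2a)\cdot 2a = 8a^2\,(2a^2 - 1) < 0
\qquad \text{for } 0 < a < \tfrac{1}{\sqrt{2}}.
\end{align*}
```

So (30) fails on a set of pairs $(x,y) = (-a, a)$ of positive measure, consistent with the first part of Lemma 13.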


We now have all the ingredients for the study of equality cases. If there is equality
in (3) for some $\nu$, then by Lemma 11 there must be equality for any translated measure
$T_v\nu$, $v \in \mathbb{R}^n$. But for $v := -\int x\,d\nu + \int x\,d\mu_V$ we have the centering condition $\int x\,dT_v\nu(x) =
\int x\,d\mu_V(x)$, and so we must have that $W_1(\mu_V, T_v\nu) = 0$, that is $T_v\nu = \mu_V$ or equivalently
$\nu = T_{-v}\mu_V$. This shows that for equality to hold, $\nu$ must be a translate of $\mu_V$. But
this in turn implies, again by Lemma 11, that there is also equality for $\nu = \mu_V$. Since
$H(\mu_V\|\mu_V) = 0$, we must have $W_{c_V}(\mu_V, \mu_V) = 0$. By Lemma 13 this implies that $V$ is
convex.

Conversely, if $V$ is convex, there is equality for $\nu = \mu_V$, because Lemma 13 ensures that
$W_{c_V}(\mu_V, \mu_V) = 0$, and by Lemma 11 we then also have equality for any translate of $\mu_V$.


□ 


3 Variance Brascamp-Lieb inequalities 

It is well known that linearization of transportation type inequalities gives Poincaré type
inequalities. One often uses the dual infimal convolution inequality to perform the
linearization, but one can also do it directly from the transportation inequality. The
procedure for linearizing the Wasserstein distance is standard, especially in the framework
of the so-called “Otto calculus” (see for instance [28]). It is also known that only the local
behavior of the cost matters for linearizing a transport inequality (see for instance [15,
Section 8.3]). However we did not find a reference for the precise situation studied here,
and so we include for completeness the following statement.

Lemma 14. Let $c : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^+$ be a function such that $c(y,y) = 0$ and $c(x,y) \ge
\delta_0|x - y|^2$ for every $x, y \in \mathbb{R}^n$, for some $\delta_0 > 0$. Assume furthermore that for every $y$ there
exists a nonnegative symmetric operator $H_y$ for which

$$c(y+h, y) = \frac12 H_y h\cdot h + |h|^2 o(1)$$

uniformly in $y$ on compact sets when $h \to 0$.

Then, if $\mu$ is a probability measure on $\mathbb{R}^n$ and $g$ is a compactly supported function
with $\int g\,d\mu = 0$, we have

$$\liminf_{\varepsilon \to 0} \frac{1}{\varepsilon^2}\, W_c(\mu, (1+\varepsilon g)\,d\mu) \ge \frac12\, \frac{\left(\int g f\,d\mu\right)^2}{\int H_y^{-1}\nabla f\cdot\nabla f\,d\mu}$$

for any compactly supported function $f$.

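Proof. Given a (bounded) function $F$ on $\mathbb{R}^n$, we introduce its infimal convolution $Q_c(F)(y) := \inf_x\{F(x) + c(x,y)\}$
associated to our cost $c$, which satisfies: for every $(x,y) \in \mathbb{R}^n \times \mathbb{R}^n$, $Q_c(F)(y) - F(x) \le
c(x,y)$. It then follows from the definition of $W_c$ that

$$W_c(\mu, \nu) \ge \int Q_c(F)\,d\nu - \int F\,d\mu.$$

In our situation where $\nu = (1 + \varepsilon g)\,d\mu$ ($\varepsilon$ small enough and later tending to 0) we pick
$F = \varepsilon f$ with $f$ of class $C^1$, compactly supported and $\int f\,d\mu = 0$. Let us write

$$Q_c(\varepsilon f)(y) = \inf_x \{\varepsilon f(x) + c(x,y)\} = \inf_h \{\varepsilon f(y+h) + c(y+h, y)\}.$$

For any given $y$, let $h_\varepsilon$ be a point where this infimum is achieved. Since the function
$f$ is Lipschitz, of constant $M > 0$ say, we have by our assumption on the cost that

$$\varepsilon f(y) - \varepsilon M|h_\varepsilon| + \delta_0|h_\varepsilon|^2 \le \varepsilon f(y + h_\varepsilon) + c(y + h_\varepsilon, y) \le \varepsilon f(y)$$

so that

$$|h_\varepsilon| \le \frac{M}{\delta_0}\,\varepsilon.$$

In other words, $h_\varepsilon$ tends to zero like $\varepsilon$ uniformly in $y$. Also, since $f$ is continuous and compactly
supported, we can find (because the cost is nonnegative and large when points are far apart)
a bounded open set $\Omega$, which contains the support of $f$, such that $Q_c(\varepsilon f)(y) \ge 0$ for every
$y \in \mathbb{R}^n \setminus \Omega$. Consequently, we have

$$W_c(\mu, (1+\varepsilon g)\,d\mu) \ge \int Q_c(\varepsilon f)\,(1+\varepsilon g)\,d\mu \ge \int_\Omega Q_c(\varepsilon f)\,(1+\varepsilon g)\,d\mu.$$

We have, uniformly for $y$ in the bounded set $\Omega$,

$$Q_c(\varepsilon f)(y) = \varepsilon f(y + h_\varepsilon) + c(y + h_\varepsilon, y) = \varepsilon f(y) + \varepsilon\nabla f(y)\cdot h_\varepsilon + \frac12 H_y h_\varepsilon\cdot h_\varepsilon + o(\varepsilon^2)$$

and so

$$Q_c(\varepsilon f)(y) \ge \varepsilon f(y) - \frac{\varepsilon^2}{2} H_y^{-1}\nabla f(y)\cdot\nabla f(y) + o(\varepsilon^2).$$

After multiplying by $(1 + \varepsilon g)$, we can integrate on $\Omega$ using that the $o(\varepsilon^2)$ is uniform in $y$:

$$\frac{1}{\varepsilon^2} W_c(\mu, (1+\varepsilon g)\,d\mu) \ge \int_\Omega f g\,d\mu - \frac12\int_\Omega H_y^{-1}\nabla f\cdot\nabla f\,d\mu + o(1).$$

This implies, using that $\Omega$ contains the support of $f$, that

$$\liminf_{\varepsilon \to 0} \frac{1}{\varepsilon^2} W_c(\mu, (1+\varepsilon g)\,d\mu) \ge \int f g\,d\mu - \frac12\int H_y^{-1}\nabla f\cdot\nabla f\,d\mu.$$

The result follows by homogeneity (replacing $f$ by $\lambda f$ and optimizing). □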
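The homogeneity step can be spelled out as a short worked computation. The symbols $a$, $b$ and $\varphi$ are introduced here only for this illustration; $b > 0$ is assumed, which holds as soon as $\nabla f$ does not vanish $\mu$-almost everywhere.

```latex
\begin{align*}
a := \int f g\, d\mu, \qquad b &:= \int H_y^{-1}\nabla f\cdot\nabla f\, d\mu, \qquad
\varphi(\lambda) := \lambda a - \frac{\lambda^2}{2}\, b, \\
\varphi'(\lambda) = a - \lambda b = 0 &\iff \lambda = \frac{a}{b}, \qquad
\max_{\lambda \in \mathbb{R}} \varphi(\lambda) = \frac{a^2}{2b}
 = \frac12\,\frac{\left(\int f g\, d\mu\right)^2}{\int H_y^{-1}\nabla f\cdot\nabla f\, d\mu}.
\end{align*}
```

Replacing $f$ by $\lambda f$ turns the lower bound $a - b/2$ into $\varphi(\lambda)$, and the best choice of $\lambda$ gives the quotient in the statement of the lemma.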
The linearization given in Lemma 14 shows that the Brascamp-Lieb inequality follows
immediately from Proposition 1 for $\nu = (1+\varepsilon g)\,d\mu_V$ with $\int g\,d\mu_V = 0$, when $\varepsilon \to 0$. Indeed,
without loss of generality we can assume that $D^2V \ge 2\delta_0\,\mathrm{Id}$ (by adding a small $\delta_0|x|^2$ and
later letting $\delta_0 \to 0$) so that the cost also satisfies $c_V(x,y) \ge \delta_0|x - y|^2$. Moreover, if $V$ is
$C^2$ we see from the definition of the cost that, when $h \to 0$,

$$c_V(y+h, y) - \frac12 D^2V(y)h\cdot h = \int_0^1 \big[D^2V(y + (1-t)h) - D^2V(y)\big]h\cdot h\,(1-t)\,dt = |h|^2 o(1),$$

where the $o(1)$ is uniform in $y$ on compact sets, since $D^2V$ is uniformly continuous on compact
sets. On the other hand, if $g$ is a compactly supported function with $\int g\,d\mu_V = 0$, we
have

$$H((1+\varepsilon g)\,d\mu_V\,\|\,\mu_V) = \frac{\varepsilon^2}{2}\int g^2\,d\mu_V + o(\varepsilon^2).$$
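The second-order expansion of the entropy is a short Taylor computation. The sketch below assumes that $g$ is bounded and that $H(\nu\|\mu_V)$ denotes the relative entropy $\int \rho\log\rho\,d\mu_V$ for $d\nu = \rho\,d\mu_V$; both are stated here as assumptions, since the definition is not repeated in this section.

```latex
\begin{align*}
(1+\varepsilon g)\log(1+\varepsilon g) &= \varepsilon g + \frac{\varepsilon^2}{2}\, g^2 + O(\varepsilon^3), \\
H\big((1+\varepsilon g)\,d\mu_V \,\|\, \mu_V\big)
 &= \int (1+\varepsilon g)\log(1+\varepsilon g)\, d\mu_V
  = \varepsilon \int g\, d\mu_V + \frac{\varepsilon^2}{2}\int g^2\, d\mu_V + o(\varepsilon^2).
\end{align*}
```

The first-order term vanishes because $\int g\,d\mu_V = 0$, which leaves the quadratic term.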

So we find, by applying Proposition 1 with $\nu = (1 + \varepsilon g)\,d\mu_V$ and the Lemma above with
the choice $f = g$ (which is the optimal one in the present situation), at the limit, that

$$\frac12\, \frac{\left(\int g^2\,d\mu_V\right)^2}{\int (D^2V(x))^{-1}\nabla g\cdot\nabla g\,d\mu_V} \le \frac12 \int g^2\,d\mu_V,$$

which is the Brascamp-Lieb inequality.
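The Brascamp-Lieb inequality $\mathrm{Var}_{\mu_V}(g) \le \int (D^2V)^{-1}\nabla g\cdot\nabla g\,d\mu_V$ can be illustrated numerically in dimension one. The sketch below is only an illustration: the potential $V(x) = x^4/4 + x^2/2$, the test function $g(x) = x$, the integration window $[-8, 8]$ and the helper name `brascamp_lieb_terms` are all hypothetical choices. It also checks the equality case $g = V'$.

```python
import math

def brascamp_lieb_terms(V, d2V, g, dg, lo=-8.0, hi=8.0, n=16000):
    """Return (variance of g, Brascamp-Lieb bound) for the density ~ exp(-V) on R."""
    h = (hi - lo) / n
    xs = [lo + i * h for i in range(n + 1)]
    # Trapezoid weights multiplied by the unnormalised density exp(-V).
    w = [math.exp(-V(x)) * h for x in xs]
    w[0] *= 0.5
    w[-1] *= 0.5
    Z = sum(w)                                   # normalising constant
    mean = sum(wi * g(x) for wi, x in zip(w, xs)) / Z
    var = sum(wi * (g(x) - mean) ** 2 for wi, x in zip(w, xs)) / Z
    bound = sum(wi * dg(x) ** 2 / d2V(x) for wi, x in zip(w, xs)) / Z
    return var, bound

# Hypothetical strictly convex potential and its first two derivatives.
V = lambda x: x ** 4 / 4 + x ** 2 / 2
dV = lambda x: x ** 3 + x
d2V = lambda x: 3 * x ** 2 + 1

# A generic test function: the variance should not exceed the bound.
var_lin, bound_lin = brascamp_lieb_terms(V, d2V, lambda x: x, lambda x: 1.0)

# The case g = V': both sides coincide, by integration by parts.
var_eq, bound_eq = brascamp_lieb_terms(V, d2V, dV, d2V)

print(var_lin, bound_lin, var_eq, bound_eq)
```

The second call reflects the identity $\int (V')^2\,d\mu_V = \int V''\,d\mu_V$, which is why functions of the form $\nabla V\cdot v_0$ play a special role in the quantitative versions discussed next.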

Let us apply the same procedure with the inequality in Theorem 3, the crucial point
being that $\mathcal{F}(h(\mu_V)\,|(y+h) - y|)$ behaves like $h(\mu_V)^2|h|^2$ for $h$ small. So the cost, denoted here by $\tilde c_V$, satisfies,
for $h \to 0$,

$$\tilde c_V(y+h, y) = c_V(y+h, y) + c\,\mathcal{F}(h(\mu_V)\,|h|) = c_V(y+h, y) + c\,h(\mu_V)^2|h|^2 + o(|h|^2)$$

$$= \frac12\Big[D^2V(y)h\cdot h + 2c\,h(\mu_V)^2\,\mathrm{Id}\,h\cdot h\Big] + o(|h|^2)$$

where $c$ is a numerical constant. The same argument as before for $\nu = (1 + \varepsilon g)\,d\mu_V$ shows
that if $g$ is a compactly supported function with $\int g\,d\mu_V = 0$ and

$$\int x\,g(x)\,d\mu_V(x) = 0$$

we have

$$\int g^2\,d\mu_V \le \int \big(D^2V + c\,h(\mu_V)^2\,\mathrm{Id}\big)^{-1}\nabla g\cdot\nabla g\,d\mu_V$$

as claimed in Proposition 5.

Finally, let us derive Proposition 6. With the notation of the Proposition, for given $g$,
denote $g_0 := g - \nabla V\cdot v_0 - c_0$. It is readily checked by elementary calculus that for every
vector $v_0$ and constant $c_0$ (so not only for the ones we have picked), if $g = g_0 + \nabla V\cdot v_0 + c_0$,

$$A := \int (D^2V(x))^{-1}\nabla g(x)\cdot\nabla g(x)\,d\mu_V(x) - \mathrm{Var}_{\mu_V}(g)$$

$$= \int (D^2V(x))^{-1}\nabla g_0(x)\cdot\nabla g_0(x)\,d\mu_V(x) - \mathrm{Var}_{\mu_V}(g_0).$$
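The "elementary calculus" behind the invariance of $A$ can be sketched as follows. This is a worked check under the assumption that $V$ and $g_0$ are regular enough for the integrations by parts against $e^{-V}$ to be justified; it uses $\nabla(\nabla V\cdot v_0) = D^2V\,v_0$ and $\int \nabla V\cdot v_0\,d\mu_V = 0$.

```latex
\begin{align*}
\int (D^2V)^{-1}\nabla g\cdot\nabla g\, d\mu_V
 &= \int (D^2V)^{-1}\nabla g_0\cdot\nabla g_0\, d\mu_V
  + 2\int \nabla g_0\cdot v_0\, d\mu_V
  + \int D^2V\, v_0\cdot v_0\, d\mu_V, \\
\mathrm{Var}_{\mu_V}(g)
 &= \mathrm{Var}_{\mu_V}(g_0)
  + 2\,\mathrm{Cov}_{\mu_V}(g_0, \nabla V\cdot v_0)
  + \mathrm{Var}_{\mu_V}(\nabla V\cdot v_0), \\
\int \nabla g_0\cdot v_0\, d\mu_V
 &= \int g_0\, \nabla V\cdot v_0\, d\mu_V
  = \mathrm{Cov}_{\mu_V}(g_0, \nabla V\cdot v_0), \\
\int D^2V\, v_0\cdot v_0\, d\mu_V
 &= \int (\nabla V\cdot v_0)^2\, d\mu_V
  = \mathrm{Var}_{\mu_V}(\nabla V\cdot v_0).
\end{align*}
```

Subtracting the second line from the first, the cross terms and the terms involving only $v_0$ cancel, which leaves the expression of $A$ in terms of $g_0$.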
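Next, for our choice of $v_0$ and $c_0$ observe that $\int g_0\,d\mu_V = 0$ and

$$\int x\,g_0(x)\,d\mu_V(x) = 0,$$

since, in the standard basis, writing $x_j = x\cdot e_j$ for $j = 1, \dots, n$, we have

$$\int x_j\,\nabla V(x)\cdot v_0\,d\mu_V(x) = \int \nabla(x_j)\cdot v_0\,d\mu_V = e_j\cdot v_0 = \int x_j\,(g(x) - c_0)\,d\mu_V(x).$$

So by Proposition 5 we find

$$A \ge \int (D^2V)^{-1}\nabla g_0\cdot\nabla g_0\,d\mu_V - \int \big(D^2V + c\lambda(\mu_V)\,\mathrm{Id}\big)^{-1}\nabla g_0\cdot\nabla g_0\,d\mu_V$$

$$= c\lambda(\mu_V)\int (D^2V)^{-1}\big(D^2V + c\lambda(\mu_V)\,\mathrm{Id}\big)^{-1}\nabla g_0\cdot\nabla g_0\,d\mu_V.$$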
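The last equality is a pointwise matrix identity. As a worked check, write $M$ for a positive definite symmetric matrix (standing for $D^2V(x)$) and $\kappa > 0$ for a scalar (standing for $c\lambda(\mu_V)$); these two symbols are introduced only for this illustration.

```latex
\begin{align*}
M^{-1} - (M + \kappa\,\mathrm{Id})^{-1}
 &= M^{-1}\big[(M + \kappa\,\mathrm{Id}) - M\big](M + \kappa\,\mathrm{Id})^{-1}
  = \kappa\, M^{-1}(M + \kappa\,\mathrm{Id})^{-1}.
\end{align*}
```

Since $M^{-1}$ and $(M + \kappa\,\mathrm{Id})^{-1}$ commute, the product on the right is again symmetric and positive definite, so it can be bounded from below eigenvalue by eigenvalue.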


From this bound, we can proceed in two different ways. First, we can use a uniform lower
bound and combine it with the Brascamp-Lieb inequality:

$$A \ge \frac{c\lambda(\mu_V)}{\sup_x \lambda_{\max}(x) + c\lambda(\mu_V)} \int (D^2V)^{-1}\nabla g_0\cdot\nabla g_0\,d\mu_V \ge \frac{c\lambda(\mu_V)}{\sup_x \lambda_{\max}(x) + c\lambda(\mu_V)} \int |g_0|^2\,d\mu_V.$$

Otherwise, using again that $D^2V \le \lambda_{\max}\,\mathrm{Id}$, we can use Hölder's inequality, to arrive at

$$A \ge c\lambda(\mu_V)\, \frac{\left(\int |\nabla g_0|\,d\mu_V\right)^2}{\int \lambda_{\max}\big(\lambda_{\max} + c\lambda(\mu_V)\big)\,d\mu_V}.$$

But (11) implies that $\int |\nabla g_0|\,d\mu_V \ge c\sqrt{\lambda(\mu_V)}\int |g_0|\,d\mu_V$ for some numerical constant $c > 0$.
This ends the proof of Proposition 6.